efg, 2017-09-02
time.1 <- Sys.time()
Required packages
library(MASS) # fgl data
library(caret) # preProcess, predict
library(dplyr) # select
library(rgl) # par3d, plot3d, movie3d, rglwidget
library(RColorBrewer) # brewer.pal
magick from ImageMagick must be installed to create the animated GIF of the PCA.
Forensic Glass Data
rawData <- fgl
typeColorIndex <- as.integer(rawData$type)
rawData <- rawData %>% select(-type)
Let’s display the first 3 principal components in a 3D scatterplot
nPCAcomponents <- 3
transformSetup <- preProcess(rawData, method=c("center", "scale", "pca"), pcaComp=nPCAcomponents)
pcaScores <- predict(transformSetup, rawData)
head(pcaScores)
         PC1        PC2        PC3
1 -1.1484468 -0.5282491  0.3712253
2  0.5727942 -0.7580105  0.5554059
3  0.9379605 -0.9276609  0.5536094
4  0.1417509 -0.9594279  0.1168507
5  0.3502710 -1.0886966  0.4839440
6  0.2895876 -1.3209105 -0.8666466
You can verify that preProcess gives the same PCA scores as in the SVD notebook.
The first 3 PCs account for about 66% of the variance in the data.
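Both statements above can be checked without the SVD notebook. The sketch below is one way to do it, assuming that base R's prcomp with centering and scaling is a fair stand-in for the preProcess call; the names X, pca and varExplained are introduced here for illustration only.

```r
library(MASS)   # fgl data

# Drop the class label; the remaining columns are the numeric measurements.
X <- fgl[, names(fgl) != "type"]

# Centered and scaled PCA, mirroring method=c("center", "scale", "pca").
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component, then the running total.
varExplained <- pca$sdev^2 / sum(pca$sdev^2)
round(cumsum(varExplained)[1:3], 3)

# Scores for the first three components. The sign of each component is
# arbitrary, so compare absolute values if the signs differ from pcaScores.
head(pca$x[, 1:3])
```

The third value of the cumulative sum is the share of variance carried by the first 3 PCs, and the first rows of pca$x should agree with head(pcaScores) up to sign.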
typeColors <- brewer.pal(length(levels(fgl$type)), "Dark2")
par3d("windowRect"=c(50,50,800,800))
plot3d(x=pcaScores$PC1, y=pcaScores$PC2, z=pcaScores$PC3,
       col=typeColors[typeColorIndex],
       xlab="PC1", ylab="PC2", zlab="PC3", type="s", size=3)
rglwidget(elementId="FGL1")
The Chrome browser works best for displaying the figure above.
Drag mouse over figure to rotate. Use mouse wheel to zoom in and out.
x <- barplot(rep(1,6), yaxt="n", col=typeColors)
text(x, 0.5, levels(fgl$type))
The plot rotates automatically for about 15 seconds when created.
Note that the “Head” (headlamp) instances form a fairly good cluster, but the other types do not.
play3d(spin3d(), duration=15)
Create the animated GIF movie using magick from ImageMagick – this takes some time. Display below using HTML.
150 PNG images will be rendered: 15 seconds duration * 10 frames/second.
movie3d(spin3d(), duration = 15, dir = getwd(),
        movie="ForensicGlass",
        verbose=FALSE, convert="magick -delay 1x%d %s*.png %s.%s")
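The frame count and the convert template can be worked through by hand. This is a small sketch, assuming movie3d's default of 10 frames per second and its default of reusing the movie name as the PNG file prefix; the variable names are illustrative and not part of rgl.

```r
duration <- 15    # seconds, as passed to movie3d
fps      <- 10    # frames per second (assumed movie3d default)

# Number of PNG frames that have to be rendered.
nFrames <- duration * fps
nFrames

# movie3d fills the convert template with sprintf-style arguments:
# fps, frame file prefix, movie name, and output type.
cmd <- sprintf("magick -delay 1x%d %s*.png %s.%s",
               fps, "ForensicGlass", "ForensicGlass", "gif")
cmd
```

In ImageMagick, "-delay 1x10" means one tick at 10 ticks per second, so each frame is shown for a tenth of a second, which matches the frame rate.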
Here’s the HTML needed in the R Markdown document to embed the GIF into the HTML file created with knitr.
<div id="PCA">
<img src="ForensicGlass.gif" alt="">
</div>
Processing time: 54.4 sec
2017-09-07 22:09:50
Practical Guide to Principal Component Analysis (PCA) in R & Python from Analytics Vidhya, 2016.
Computing and visualizing PCA in R by Thiago G. Martins, 2013.
Introduction to Principal Component Analysis (PCA) by Thiago G. Martins, 2013.
Principal Components Analysis notes from class given by Brian Junker and Cosma Shalizi at CMU, 2010.
Principal Components Analysis: A How-To Manual for R by Emily Mankin. Includes two PCA principles (pp. 3-4), four major assumptions (p. 12), a “do it yourself” method (p. 6), and using built-in R functions (p. 7).
Naive Principal Component Analysis in R, Data Science Central, Pablo Bernabeu, 2017.